背景:患者的分类是控制2019年冠状病毒疾病的大流行病(Covid-19),特别是在临床资源极为有限时在大流行的峰值期间。目的:开发一种用合成胸CT自动筛分和量化肺和肺炎病变的方法,并评估Covid-19患者的疾病严重程度。材料和方法:在本研究中,我们通过可用的数据集(来自“肺结核分析2016年”的285个数据集“来生成数据增强以产生合成胸CT图像。合成图像和掩模用于训练2D U-Net神经网络并在203个Covid-19数据集上测试,以产生肺和病变分段。疾病严重程度评分(DL:损伤负荷; DS:损伤得分)是基于分段计算的。使用Pearson方法评估DL / DS和临床实验室测试之间的相关性。 p值<0.05被认为是统计显着性。结果:将自动肺和病变分段与手动注释进行比较。对于肺部分割,骰子相似系数,Jaccard指数和平均表面距离的中值分别为98.56%,97.15%和0.49 mm。病变分割的相同度量分别为76.95%,62.54%和2.36毫米。在DL / DS和百分比淋巴细胞检测中发现显着(P << 0.05)相关性,R值分别为-0.561和-0.501。结论:基于胸部射线照相和数据增强的AI系统对Covid-19患者的肺癌和病变进行了分段。成像结果与临床实验室测试之间的相关性表明该系统的价值作为评估Covid-19疾病严重程度的潜在工具。
translated by 谷歌翻译
在集成感测和通信(ISAC)系统中表征传感和通信性能权衡,在基于学习的人类运动识别的应用中具有挑战性。这是因为大型实验数据集和深神经网络的黑盒性质。本文介绍了SDP3,这是一种模拟驱动的性能预测指标和优化器,由SDP3数据模拟器,SDP3性能预测器和SDP3性能优化器组成。具体而言,SDP3数据模拟器在虚拟环境中生成生动的无线传感数据集,SDP3性能预测器预测基于函数回归方法的传感性能,而SDP3性能优化器会在分析上研究传感和通信性能。结果表明,模拟传感数据集在运动识别精度中非常匹配实验数据集。通过利用SDP3,发现可实现的识别准确性和通信吞吐量由通信饱和区组成,感应饱和区和通讯感应的对抗区域,ISAC系统的所需平衡性能位于第三个一。
translated by 谷歌翻译
In recent years, arbitrary image style transfer has attracted more and more attention. Given a pair of content and style images, a stylized one is hoped that retains the content from the former while catching style patterns from the latter. However, it is difficult to simultaneously keep well the trade-off between the content details and the style features. To stylize the image with sufficient style patterns, the content details may be damaged and sometimes the objects of images can not be distinguished clearly. For this reason, we present a new transformer-based method named STT for image style transfer and an edge loss which can enhance the content details apparently to avoid generating blurred results for excessive rendering on style features. Qualitative and quantitative experiments demonstrate that STT achieves comparable performance to state-of-the-art image style transfer methods while alleviating the content leak problem.
translated by 谷歌翻译
In recent years, the Transformer architecture has shown its superiority in the video-based person re-identification task. Inspired by video representation learning, these methods mainly focus on designing modules to extract informative spatial and temporal features. However, they are still limited in extracting local attributes and global identity information, which are critical for the person re-identification task. In this paper, we propose a novel Multi-Stage Spatial-Temporal Aggregation Transformer (MSTAT) with two novel designed proxy embedding modules to address the above issue. Specifically, MSTAT consists of three stages to encode the attribute-associated, the identity-associated, and the attribute-identity-associated information from the video clips, respectively, achieving the holistic perception of the input person. We combine the outputs of all the stages for the final identification. In practice, to save the computational cost, the Spatial-Temporal Aggregation (STA) modules are first adopted in each stage to conduct the self-attention operations along the spatial and temporal dimensions separately. We further introduce the Attribute-Aware and Identity-Aware Proxy embedding modules (AAP and IAP) to extract the informative and discriminative feature representations at different stages. All of them are realized by employing newly designed self-attention operations with specific meanings. Moreover, temporal patch shuffling is also introduced to further improve the robustness of the model. Extensive experimental results demonstrate the effectiveness of the proposed modules in extracting the informative and discriminative information from the videos, and illustrate the MSTAT can achieve state-of-the-art accuracies on various standard benchmarks.
translated by 谷歌翻译
Machine learning models are typically evaluated by computing similarity with reference annotations and trained by maximizing similarity with such. Especially in the bio-medical domain, annotations are subjective and suffer from low inter- and intra-rater reliability. Since annotations only reflect the annotation entity's interpretation of the real world, this can lead to sub-optimal predictions even though the model achieves high similarity scores. Here, the theoretical concept of Peak Ground Truth (PGT) is introduced. PGT marks the point beyond which an increase in similarity with the reference annotation stops translating to better Real World Model Performance (RWMP). Additionally, a quantitative technique to approximate PGT by computing inter- and intra-rater reliability is proposed. Finally, three categories of PGT-aware strategies to evaluate and improve model performance are reviewed.
translated by 谷歌翻译
We propose a novel approach to self-supervised learning of point cloud representations by differentiable neural rendering. Motivated by the fact that informative point cloud features should be able to encode rich geometry and appearance cues and render realistic images, we train a point-cloud encoder within a devised point-based neural renderer by comparing the rendered images with real images on massive RGB-D data. The learned point-cloud encoder can be easily integrated into various downstream tasks, including not only high-level tasks like 3D detection and segmentation, but low-level tasks like 3D reconstruction and image synthesis. Extensive experiments on various tasks demonstrate the superiority of our approach compared to existing pre-training methods.
translated by 谷歌翻译
Collaboration among industrial Internet of Things (IoT) devices and edge networks is essential to support computation-intensive deep neural network (DNN) inference services which require low delay and high accuracy. Sampling rate adaption which dynamically configures the sampling rates of industrial IoT devices according to network conditions, is the key in minimizing the service delay. In this paper, we investigate the collaborative DNN inference problem in industrial IoT networks. To capture the channel variation and task arrival randomness, we formulate the problem as a constrained Markov decision process (CMDP). Specifically, sampling rate adaption, inference task offloading and edge computing resource allocation are jointly considered to minimize the average service delay while guaranteeing the long-term accuracy requirements of different inference services. Since CMDP cannot be directly solved by general reinforcement learning (RL) algorithms due to the intractable long-term constraints, we first transform the CMDP into an MDP by leveraging the Lyapunov optimization technique. Then, a deep RL-based algorithm is proposed to solve the MDP. To expedite the training process, an optimization subroutine is embedded in the proposed algorithm to directly obtain the optimal edge computing resource allocation. Extensive simulation results are provided to demonstrate that the proposed RL-based algorithm can significantly reduce the average service delay while preserving long-term inference accuracy with a high probability.
translated by 谷歌翻译
The traditional statistical inference is static, in the sense that the estimate of the quantity of interest does not affect the future evolution of the quantity. In some sequential estimation problems however, the future values of the quantity to be estimated depend on the estimate of its current value. This type of estimation problems has been formulated as the dynamic inference problem. In this work, we formulate the Bayesian learning problem for dynamic inference, where the unknown quantity-generation model is assumed to be randomly drawn according to a random model parameter. We derive the optimal Bayesian learning rules, both offline and online, to minimize the inference loss. Moreover, learning for dynamic inference can serve as a meta problem, such that all familiar machine learning problems, including supervised learning, imitation learning and reinforcement learning, can be cast as its special cases or variants. Gaining a good understanding of this unifying meta problem thus sheds light on a broad spectrum of machine learning problems as well.
translated by 谷歌翻译
Most Graph Neural Networks follow the message-passing paradigm, assuming the observed structure depicts the ground-truth node relationships. However, this fundamental assumption cannot always be satisfied, as real-world graphs are always incomplete, noisy, or redundant. How to reveal the inherent graph structure in a unified way remains under-explored. We proposed PRI-GSL, a Graph Structure Learning framework guided by the Principle of Relevant Information, providing a simple and unified framework for identifying the self-organization and revealing the hidden structure. PRI-GSL learns a structure that contains the most relevant yet least redundant information quantified by von Neumann entropy and Quantum Jensen-Shannon divergence. PRI-GSL incorporates the evolution of quantum continuous walk with graph wavelets to encode node structural roles, showing in which way the nodes interplay and self-organize with the graph structure. Extensive experiments demonstrate the superior effectiveness and robustness of PRI-GSL.
translated by 谷歌翻译
Face forgery detection plays an important role in personal privacy and social security. With the development of adversarial generative models, high-quality forgery images become more and more indistinguishable from real to humans. Existing methods always regard as forgery detection task as the common binary or multi-label classification, and ignore exploring diverse multi-modality forgery image types, e.g. visible light spectrum and near-infrared scenarios. In this paper, we propose a novel Hierarchical Forgery Classifier for Multi-modality Face Forgery Detection (HFC-MFFD), which could effectively learn robust patches-based hybrid domain representation to enhance forgery authentication in multiple-modality scenarios. The local spatial hybrid domain feature module is designed to explore strong discriminative forgery clues both in the image and frequency domain in local distinct face regions. Furthermore, the specific hierarchical face forgery classifier is proposed to alleviate the class imbalance problem and further boost detection performance. Experimental results on representative multi-modality face forgery datasets demonstrate the superior performance of the proposed HFC-MFFD compared with state-of-the-art algorithms. The source code and models are publicly available at https://github.com/EdWhites/HFC-MFFD.
translated by 谷歌翻译